Summarization of Multiple

Authors

  • Jade Goldstein
  • Vibhu Mittal
  • Yiming Yang
  • Jan Pedersen
Abstract

In this era, where electronic text information is growing exponentially and time is a critical resource, it has become virtually impossible for any user to browse or read large numbers of individual documents. It is therefore important to explore methods that allow users to locate and browse information quickly within collections of documents. Automatic text summarization of multiple documents fulfills such information-seeking goals by providing a method for the user to quickly view highlights and/or relevant portions of document collections. As yet, there has been little work on multi-document summarization, although single-document summarization has been a subject of focus in the last few years. Multi-document summarization differs from single-document summarization in that the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. If multi-document summarization is to be useful across subject areas and languages, it must be relatively independent of natural language understanding. A statistical approach allows for both rapid passage selection and speed. The maximal marginal relevance (MMR) metric is used to provide "relevant" novelty in passage selection, i.e., selecting passages that meet the criteria of relevance to a query while reducing redundancy and maximizing diversity among the individual passages. The approach builds on previous work in single-document summarization by using additional, available information about the document set as a whole, the relationships between the documents, and properties of individual documents. The underlying framework is modular, thus allowing easy parameterization to take into account different document genres or corpus characteristics, user requirements, and linguistic properties of languages that can enhance summarization results.
The principal question being addressed is "Can multi-document summarization effectively indicate the textual content of document collections and assist users to rapidly find their desired information?" I will explore this question by evaluating the system in the domains of newswire articles, web pages, and, time permitting, computer science technical reports.
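
The MMR-based passage selection described above can be illustrated with a minimal sketch. The abstract does not state the formula or the similarity measures, so this example assumes the commonly cited greedy form, score = lam * Sim(passage, query) - (1 - lam) * max Sim(passage, already selected), with plain bag-of-words cosine similarity standing in for both similarity functions. The function names, the value of lam, and the example passages are illustrative assumptions, not details taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector (an assumed stand-in for real term weighting)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(query, passages, k=3, lam=0.7):
    """Greedily pick up to k passage indices, trading relevance against redundancy.

    lam = 1.0 ranks purely by relevance to the query; smaller values
    penalize passages that resemble ones already selected.
    """
    q = bag_of_words(query)
    vecs = [bag_of_words(p) for p in passages]
    selected = []
    candidates = list(range(len(passages)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vecs[i], q)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical passages drawn from several documents; two are verbatim duplicates.
passages = [
    "earthquake hits city center",
    "earthquake hits city center",
    "city hospitals treat injured",
    "stock markets rose sharply",
]
print(mmr_select("earthquake city", passages, k=2, lam=0.5))  # [0, 2]
print(mmr_select("earthquake city", passages, k=2, lam=1.0))  # [0, 1]
```

With lam = 0.5 the duplicate passage is skipped in favor of a less relevant but novel one, while lam = 1.0 reduces to plain relevance ranking and returns both duplicates. This is the redundancy problem that becomes acute when passages come from several documents covering the same event.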

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted the attention of many researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are facing large data volume documents, the extr...

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Graph Hybrid Summarization

One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...

Effective Browsing of Image Search Results with Diverse Visual Summarization through Clustering

With the unprecedented growth in the production of digital images and the use of multimedia references, the need for image and subject search has increased. Systematic processing of this information is a basic prerequisite for its effective analysis, organization and management. Likewise, large collections of images have been made available on the Web and many search engines have provided the poss...

Systematic literature review of fuzzy logic based text summarization

"Information Overload" is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting users' informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...

Publication date: 1999